7 research outputs found

    Aleatoric Uncertainty Modelling in Regression Problems using Deep Learning

    Get PDF
    [eng] Nowadays, we live in a world that is, from our perspective, intrinsically uncertain. We do not know what will happen in the future but, to infer it, we build so-called models. These models are abstractions of the world we live in, which allow us to conceive how the world works; they are essentially validated against our previous experience and discarded if their predictions prove incorrect. This common scientific process of inference has several non-deterministic steps. First of all, our measuring instruments could be inaccurate. That is, the information we use a priori to infer what will happen may already contain some irreducible error. Besides, the past experience used to build the model could be biased (and, therefore, we would incorrectly infer the future, as the model would be based on unrepresentative data). On the other hand, the model itself may be an oversimplification of reality (which would lead us to unrealistic generalizations). Furthermore, the overall task of inferring the future may be downright non-deterministic. This often happens when the information we have a priori is incomplete or partial for the task to be performed (i.e. the outcome depends on factors we cannot observe at prediction time) and we are, consequently, obliged to consider that what we want to predict is not a deterministic value. One way to model all of these uncertainties is through a probabilistic approach that mathematically formalizes these sources of uncertainty in order to create specific methods that capture them. Accordingly, the general aim of this thesis is to define a probabilistic approach that helps artificial-intelligence-based systems (specifically, deep learning systems) become robust and reliable enough to be applied to high-risk problems, where good average performance is not enough and critical, high-cost errors must be avoided.
In particular, the thesis highlights the current divergence in the literature when it comes to dividing and naming the different types of uncertainty, and proposes a procedure to follow. In addition, based on a real problem arising from the industrial nature of this thesis, it emphasizes the importance of investigating the last type of uncertainty, the so-called aleatoric uncertainty, which arises from the lack of a priori information needed to infer the future deterministically. The thesis delves into different literature models for capturing aleatoric uncertainty using deep learning and analyzes their limitations. In addition, it proposes new state-of-the-art approaches that overcome the limitations exposed during the thesis. Applying aleatoric uncertainty modelling to real-world problems raises the problem of modelling the uncertainty of a black-box system. Generically, a black-box system is a pre-existing predictive system that does not originally model uncertainty and about whose internals no requirements or assumptions are made. The goal is therefore to build a new system that wraps the black box and models the uncertainty of the original system. In this scenario, not all previously introduced aleatoric uncertainty modelling approaches can be considered, which implies that flexible methods such as Quantile Regression (QR) need to be modified in order to be applied in this context. Subsequently, the QR study brings the need to solve one critical problem in the QR literature, the so-called crossing quantiles phenomenon, which motivates the proposal of new additional models to solve it. Finally, all of the above research is summarized in visualization and evaluation methods for the predicted uncertainty, in order to produce uncertainty-tailored methods.
[cat] We are surrounded by uncertainty. Every decision we make has a probability of turning out as we expect and, depending on it, we often condition our decisions. In the same way, autonomous systems must know how to interpret these uncertain scenarios. However, despite the great advances in the field of artificial intelligence, we are currently at a point where the inability of these systems to identify a priori a higher-risk scenario prevents their inclusion in solutions that could revolutionize society as we know it. The challenge is significant and, for this reason, it is essential that these systems learn to model and manage all sources of uncertainty. Starting from a probabilistic approach, this thesis proposes to formalize the different types of uncertainty and, in particular, focuses its research on the type known as aleatoric uncertainty, since it was identified as the main decisive uncertainty to address in the original financial problem that motivated this industrial doctorate. Building on this investigation, the thesis proposes new models that improve the state of the art in aleatoric uncertainty modelling, and introduces a new problem, arising from a real industrial need, that appears when there is a predictive system in production that does not model uncertainty and one wants to model its uncertainty a posteriori and independently. This problem is denoted as the uncertainty modelling of a black-box system, and it motivates the proposal of new models that preserve predictive advantages, such as those of Quantile Regression (QR), while adapting them to the black-box problem. Subsequently, the research in QR motivates the proposal of new models to solve a fundamental problem in the QR literature known as the crossing quantiles phenomenon, which appears when, predicting several quantiles simultaneously, the order between quantiles is not preserved. Finally, all the above research is summarized in visualization and evaluation methods for the reported uncertainty, in order to produce methods that use this extra information to make more robust decisions.
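The quantile regression mentioned in the abstract is conventionally trained with the pinball (quantile) loss, and the crossing-quantiles phenomenon is exactly a violation of monotonicity across quantile levels. The following NumPy sketch is illustrative only (it is not the thesis code): it shows the standard pinball loss, checks numerically that it is minimized near the true quantile, and demonstrates the non-crossing property that the thesis's models must enforce.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss for quantile level tau in (0, 1)."""
    err = y_true - y_pred
    return np.mean(np.maximum(tau * err, (tau - 1.0) * err))

# Illustrative check: over constant predictions, the pinball loss is
# minimized at the empirical tau-quantile of the data.
rng = np.random.default_rng(0)
y = rng.normal(size=10_000)
candidates = np.linspace(-3.0, 3.0, 601)
losses = [pinball_loss(y, c, 0.9) for c in candidates]
best = candidates[int(np.argmin(losses))]  # close to the 0.9 quantile of N(0, 1)

# Crossing-quantiles check: jointly predicted quantiles must be
# non-decreasing in tau; a "crossing" is any violation of this order.
q_preds = {0.1: -1.3, 0.5: 0.0, 0.9: 1.3}  # hypothetical model outputs
taus = sorted(q_preds)
assert all(q_preds[a] <= q_preds[b] for a, b in zip(taus, taus[1:]))
```

Models that predict each quantile independently minimize this loss per level but give no such ordering guarantee, which is what motivates the non-crossing models proposed in the thesis.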

    Estudi de les xarxes neuronals convolucionals profundes mitjançant Caffe

    Get PDF
    Bachelor's thesis in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, Year: 2015, Supervisor: Jordi Vitrià i Marca. Deep learning techniques and the use of GPUs have made neural networks the leading option for solving some computational problems, producing state-of-the-art results in many fields such as computer vision, automatic speech recognition, natural language processing, and audio recognition. This bachelor's dissertation is divided into three parts. First, we discuss the theoretical concepts that allow us to understand how neural networks work. Second, we focus on deep convolutional neural networks in order to understand and learn how to build the kind of networks that the Caffe framework uses. The third part compiles all the learned skills to make a neural network classify the MNIST dataset correctly; we then modify the Caffe framework files so that it can read images with more channels and examine the results obtained. Finally, we match and improve upon the public state-of-the-art classification system for the Food-101 dataset. Our goals are achieved: we modify the Caffe framework and verify that, in the MNIST case, this improves the classification rate. Above all, we release a classification system for Food-101 that improves accuracy from 56.40% to 74.6834%, and we propose ideas for further improving this classification in the future.

    Retrospective Uncertainties for Deep Models using Vine Copulas

    Get PDF
    Despite the major progress of deep models as learning machines, uncertainty estimation remains a major challenge. Existing solutions rely on modified loss functions or architectural changes. We propose to compensate for the lack of built-in uncertainty estimates by supplementing any network, retrospectively, with a subsequent vine copula model, in an overall compound we call Vine-Copula Neural Network (VCNN). Through synthetic and real-data experiments, we show that VCNNs can be task- (regression/classification) and architecture- (recurrent, fully connected) agnostic while providing reliable and better-calibrated uncertainty estimates, comparable to state-of-the-art built-in uncertainty solutions. The research leading to these results has received funding from the Horizon Europe Programme under the SAFEXPLAIN Project (www.safexplain.eu), grant agreement No. 101069595, and the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 772773). Additionally, this work has been partially supported by Grant PID2019-107255GB-C21 funded by MCIN/AEI/10.13039/501100011033. Peer reviewed. Postprint (published version).

    Building uncertainty models on top of black-box predictive APIs

    Get PDF
    With the commoditization of machine learning, more and more off-the-shelf models are available as part of code libraries or cloud services. Typically, data scientists and other users apply these models as "black boxes" within larger projects. In the case of regressing a scalar quantity, such APIs typically offer a predict() function, which outputs the estimated target variable (often referred to as ŷ or, in code, y_hat). However, many real-world problems may require some sort of deviation interval or uncertainty score rather than a single point-wise estimate. In other words, a mechanism is needed with which to answer the question "How confident is the system about that prediction?" Motivated by the lack of this characteristic in most predictive APIs designed for regression purposes, we propose a method that adds an uncertainty score to every black-box prediction. Since the underlying model is not accessible, and therefore standard Bayesian approaches are not applicable, we adopt an empirical approach and fit an uncertainty model using a labelled dataset (x, y) and the outputs ŷ of the black box. In order to be able to use any predictive system as a black box and adapt to its complex behaviours, we propose three variants of an uncertainty model based on deep networks. The first adds a heteroscedastic noise component to the black-box output, the second predicts the residuals of the black box, and the third performs quantile regression using deep networks. Experiments using real financial data that contain an in-production black-box system and two public datasets (energy forecasting and biology responses) illustrate and quantify how uncertainty scores can be added to black-box outputs.
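The second variant described above (modelling the residuals of the black box) can be sketched in a few lines. This is a deliberately simplified NumPy illustration of the general idea, not the paper's method: the paper fits deep networks on the residuals, whereas here a binned average of absolute residuals stands in as the uncertainty model, and `black_box_predict` is a hypothetical stand-in for any in-production predict() API.

```python
import numpy as np

def black_box_predict(x):
    """Hypothetical black box: we may only call predict(), never inspect it."""
    return 2.0 * x

def fit_residual_model(x, y, bins=10):
    """Fit a simple uncertainty model on top of the black box: estimate the
    typical absolute residual |y - y_hat| per region of the input space.
    (Binned averages stand in for the deep networks used in the paper.)"""
    y_hat = black_box_predict(x)
    resid = np.abs(y - y_hat)
    edges = np.quantile(x, np.linspace(0.0, 1.0, bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, bins - 1)
    scores = np.array([resid[idx == b].mean() for b in range(bins)])
    return edges, scores

def uncertainty_score(x_new, edges, scores):
    """Report an uncertainty score alongside the black-box point estimate."""
    b = np.clip(np.searchsorted(edges, x_new, side="right") - 1, 0, len(scores) - 1)
    return scores[b]
```

Each prediction ŷ can then be reported together with its uncertainty score, which is the behaviour the paper adds to predictive APIs that only expose a point estimate.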

    Convolutional neural networks and probabilistic models

    No full text
    Deep learning techniques have made neural networks the leading option for solving some computational problems, producing state-of-the-art results in many fields such as computer vision, automatic speech recognition, natural language processing, and audio recognition. We may be tempted to use neural networks directly, as we know them nowadays, to make predictions and solve many problems; however, when the decision to be taken is high-risk (for instance, controlling a nuclear power plant or predicting the evolution of share prices in the market), it is important to look for methods that let us add information about the certainty of those predictions. This Master's thesis is divided into three parts. First, we analyse the state of the art of Mixture Density Network (MDN) models, which predict an entire probability distribution for the output, and we develop an implementation that solves many of the numerical stability problems that characterise this type of model. Second, as an initial solution to the uncertainty problems introduced above, we focus on extracting a confidence factor from the neural network outputs of a problem in which we only want to make a prediction when we have a minimum certainty about it. To do so, we compile the current literature methods for measuring uncertainty through Mixture Density Networks and implement all of them. We then go into detail about the concept of uncertainty and see to what extent neural network models can address the different aspects that this concept includes.
Finally, the third part presents several proposals to measure the confidence factor obtained with Mixture Density Networks for the proposed problem. Our goals are achieved: we produce a stable implementation for all the problems we have posed for Mixture Density Networks and publish it in our GitHub repository [9]. We implement the state-of-the-art methods that allow us to obtain a confidence factor and, finally, we propose a method that obtains the expected results for the parameters that represent the confidence factor.
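The numerical stability problems mentioned for Mixture Density Networks typically arise when the mixture likelihood is computed as a product of very small densities. The standard remedy is to evaluate the negative log-likelihood entirely in log space with the log-sum-exp trick. The sketch below is a NumPy illustration of that general technique under assumed names; it is not the thesis's implementation.

```python
import numpy as np

def mdn_nll(y, log_pi, mu, log_sigma):
    """Negative log-likelihood of scalar y under a 1-D Gaussian mixture.
    log_pi, mu, log_sigma: arrays of shape (n_components,).
    Working with log-weights and log-scales avoids the underflow that a
    naive product of mixture densities would produce."""
    sigma = np.exp(log_sigma)
    # Per-component log of (mixing weight * Gaussian density of y).
    log_comp = (log_pi
                - 0.5 * np.log(2.0 * np.pi)
                - log_sigma
                - 0.5 * ((y - mu) / sigma) ** 2)
    # Log-sum-exp: subtracting the max keeps np.exp in a safe range.
    m = np.max(log_comp)
    return -(m + np.log(np.sum(np.exp(log_comp - m))))
```

With a single standard-Gaussian component this reduces to the closed-form Gaussian negative log-density, and it stays finite even for points far in the tails, where a naive density computation would underflow to zero and yield an infinite loss.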

    Main sources of variability and non-determinism in AD software: taxonomy and prospects to handle them

    No full text
    Safety standards in domains like automotive and avionics seek deterministic execution (lack of jittery behavior) as a stepping stone for building a certification argument on the correct timing behavior of the system. However, the use of artificial intelligence (AI) software in safety-critical systems carries several built-in and derivative sources of non-determinism that are at odds with the determinism requirements of safety standards. In this work we analyze the main sources of non-determinism in autonomous driving (AD) software, as a highly representative and compelling example of the use of AI software, deep neural networks (DNNs) in particular, in critical embedded systems. Paradoxically, DNN-based software in its inference phase, once the network structure and weights have been fixed, turns out to consist mainly of matrix multiplications, which are inherently quite time-deterministic. Our work focuses on sources of variability and non-determinism in AD software, covering the algorithmic elements of AD software, the low-level software and hardware computing platform, and data-flow constraints among AD modules. As the final contribution of our work, which mainly focuses on problem identification, we develop some prospects on the information and metrics needed to better understand and control the unpredictability and non-determinism of AD software. This work has been supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GB-C21/AEI/10.13039/501100011033, and the European Research Council (ERC) grant agreement No. 772773 (SuPerCom). Peer reviewed. Postprint (author's final draft).